The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice or about the bottlenecks the community faces in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based either on multiple identical models (61%) or on heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Recent advances in computer vision have shown promising results in image generation. Diffusion probabilistic models in particular have generated realistic images from textual input, as demonstrated by DALL-E 2, Imagen and Stable Diffusion. However, their use in medicine, where image data typically comprises three-dimensional volumes, has not been systematically evaluated. Synthetic images may play a crucial role in privacy-preserving artificial intelligence and can also be used to augment small datasets. Here we show that diffusion probabilistic models can synthesize high-quality medical imaging data, which we demonstrate for Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) images. We provide quantitative measurements of their performance through a reader study with two medical experts who rated the quality of the synthesized images in three categories: realistic image appearance, anatomical correctness and consistency between slices. Furthermore, we demonstrate that synthetic images can be used in self-supervised pre-training and improve the performance of breast segmentation models when data is scarce (Dice score 0.91 vs. 0.95 without vs. with synthetic data).
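Cardiac magnetic resonance (CMR) sequences visualise cardiac function voxel-wise over time. At the same time, deep learning-based deformable image registration can estimate, in a self-supervised manner, discrete vector fields that warp one time step of a CMR sequence to the following one. However, despite the rich source of information contained in these 3D+t vector fields, a standardised interpretation is challenging, and clinical applications have remained limited so far. In this work, we show how to efficiently use a deformable vector field to describe the underlying dynamic process of a cardiac cycle in the form of a derived 1D motion descriptor. Additionally, based on the expected cardiovascular physiological properties of a contracting or relaxing ventricle, we define a set of rules that enables the identification of five cardiovascular phases, including end-systole (ES) and end-diastole (ED), without the use of labels. We evaluate the plausibility of the motion descriptor on two challenging multi-disease, multi-center, multi-scanner short-axis CMR datasets: first, by reporting quantitative measures such as the periodic frame difference for the extracted phases; second, by qualitatively comparing the general pattern when we temporally resample and align the motion descriptors of all instances across both datasets. The average periodic frame difference for the ED and ES key phases of our approach is $0.80 \pm 0.85$ and $0.69 \pm 0.79$, which is slightly better than the inter-observer variability ($1.07 \pm 0.86$, $0.91 \pm 1.6$) and the supervised baseline method ($1.18 \pm 1.91$, $1.21 \pm 1.78$). Code and labels will be made available in our GitHub repository: https://github.com/cardio-ai/cmr-phase-detection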
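The abstract reports a periodic frame difference but does not define it. A minimal sketch follows, assuming it means the shortest distance between a predicted and a reference frame index when the cardiac cycle is treated as circular. The function name, its parameters, and this reading of the metric are illustrative assumptions, not taken from the authors' code.

```python
def periodic_frame_difference(predicted: int, reference: int, cycle_length: int) -> int:
    """Shortest distance between two frame indices on a cyclic sequence.

    Assumption: the cycle wraps around, so frame 0 and frame
    cycle_length - 1 are neighbours.
    """
    if cycle_length <= 0:
        raise ValueError("cycle_length must be positive")
    direct = abs(predicted - reference) % cycle_length
    # Going "the other way round" the cycle may be shorter.
    return min(direct, cycle_length - direct)
```

Under this reading, a prediction at frame 1 against a reference at frame 29 in a 30-frame sequence counts as two frames off rather than 28.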
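Differential privacy (DP) has become the gold standard for protecting an individual's privacy in datasets by adding calibrated noise to each data sample. While its application to categorical data is straightforward, its usability in the context of images has been limited. In contrast to categorical data, the meaning of an image is inherent in the spatial correlation of neighboring pixels, which makes the naive application of noise infeasible. Invertible neural networks (INNs) have shown excellent generative performance while still providing the ability to quantify the exact likelihood. Their principle is based on transforming a complicated distribution into a simple one, e.g., an image into a spherical Gaussian. We hypothesize that adding noise to the latent space of an INN can enable differentially private image modification. Manipulating the latent space leads to a modified image while preserving important details. Furthermore, by conditioning the INN on metadata provided with the dataset, we aim to leave dimensions that are important for downstream tasks such as classification untouched, while altering other parts that potentially contain identifying information. We term our method content-aware differential privacy (CADP). We conduct experiments on public benchmarking datasets as well as dedicated medical ones. In addition, we show the generalizability of our method to categorical data. The source code is publicly available at https://github.com/cardio-ai/cadp.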
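Multi-object tracking (MOT) is a challenging task that involves detecting objects in a scene and tracking them across a sequence of frames. Evaluating this task is difficult because of temporal occlusions and varying trajectories across a sequence of images. The main evaluation metric for benchmarking MOT methods on datasets such as KITTI has become the higher order tracking accuracy (HOTA) metric, which describes performance better than metrics such as MOTA, DetA, and IDF1. Point detection and tracking is a closely related task that can be regarded as a special case of object detection. However, there are differences in evaluating the detection task itself (point distances vs. bounding box overlap). When the temporal dimension and multi-view scenarios are included, the evaluation task becomes even more complex. In this work, we propose a multi-view higher order tracking metric (mvHOTA) to determine the accuracy of multi-point (multi-instance and multi-class) detection while taking temporal and spatial associations into account. mvHOTA can be interpreted as the geometric mean of the detection, association, and correspondence accuracies, thereby giving equal weight to each factor. We demonstrate a use case on a publicly available endoscopic point detection dataset from a previously organised medical challenge. Furthermore, we compare with other adjusted MOT metrics for this use case, discuss the properties of mvHOTA, and show how the proposed correspondence accuracy and the occlusion index facilitate the analysis of methods with respect to their handling of occlusions. The code will be made publicly available.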
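The geometric-mean interpretation mentioned above can be illustrated with a minimal sketch. It assumes the three accuracies have already been computed and lie in [0, 1]. It also ignores any averaging over localization thresholds that a full HOTA-style metric may perform. The function and parameter names are hypothetical and not taken from the authors' code.

```python
def geometric_mean_score(detection_acc: float,
                         association_acc: float,
                         correspondence_acc: float) -> float:
    """Geometric mean of three accuracies, each weighted equally.

    Illustrative only: the inputs are assumed to be precomputed
    accuracies in the range [0, 1].
    """
    factors = (detection_acc, association_acc, correspondence_acc)
    for value in factors:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each accuracy must lie in [0, 1]")
    product = 1.0
    for value in factors:
        product *= value
    # Cube root of the product: a zero in any factor yields a zero score.
    return product ** (1.0 / 3.0)
```

With a geometric mean, a strong result in one factor cannot compensate for a collapse in another, which matches the abstract's point that each factor is weighted equally.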
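Purpose: Mitral valve repair is a complex minimally invasive surgery of the heart valve. In this context, suture detection from endoscopic images is a highly relevant task that provides quantitative information to analyse suturing patterns, assess prosthetic configurations and produce augmented reality visualisations. Facial or anatomical landmark detection tasks typically contain a fixed number of landmarks and use regression or fixed heatmap-based approaches to localize them. In endoscopy, however, each image contains a varying number of sutures, and the sutures may occur at any location in the annulus, as they are not semantically unique. Method: In this work, we formulate the suture detection task as a multi-instance deep heatmap regression problem to identify the entry and exit points of sutures. We extend our previous work and introduce the novel use of a 2D Gaussian layer followed by a differentiable 2D spatial soft-argmax layer that functions as a local non-maximum suppression. Results: We present extensive experiments with multiple heatmap distribution functions and two variants of the proposed model. In the intra-operative domain, Variant 1 showed a mean F1 improvement of +0.0422 over the baseline. Similarly, in the simulator domain, Variant 1 showed a mean F1 improvement of +0.0865 over the baseline. Conclusion: The proposed model shows an improvement over the baseline in both the intra-operative and the simulator domains. The data is made publicly available within the scope of the MICCAI AdaptOR2021 Challenge (https://adaptor2021.github.io/), and the code is available at https://github.com/cardio-ai/suture-detection-pytorch/. DOI: 10.1007/s11548-021-02523-w. The link to the open-access article can be found here: https://link.springer.com/article/10.1007%2FS11548-021-02523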
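To illustrate the soft-argmax idea mentioned in the method, here is a minimal pure-Python sketch over a single heatmap. It computes a global soft-argmax, that is, a softmax-weighted average of pixel coordinates. It is not the authors' layer, which the abstract describes as a local non-maximum suppression inside a network. The function name and the sharpness parameter `beta` are illustrative assumptions.

```python
import math


def soft_argmax_2d(heatmap, beta=10.0):
    """Return the (row, col) expectation under a softmax over the heatmap.

    heatmap: non-empty list of equally long lists of floats.
    beta: sharpness; larger values move the result closer to the hard argmax.
    """
    peak = max(max(row) for row in heatmap)
    total = 0.0
    row_acc = 0.0
    col_acc = 0.0
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            # Subtracting the peak keeps the exponent <= 0 (numerical stability).
            weight = math.exp(beta * (value - peak))
            total += weight
            row_acc += weight * r
            col_acc += weight * c
    return row_acc / total, col_acc / total
```

Because the result is a smooth function of the heatmap values, gradients can flow through the coordinate estimate, unlike with a hard argmax. On a flat heatmap, the estimate falls back to the image centre.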
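Bayesian methods have useful properties for solving inverse problems such as tomographic reconstruction. The prior distribution introduces regularization, which helps to solve the ill-posed problem and reduces overfitting. In practice, however, this often results in a suboptimal posterior temperature, and the full potential of the Bayesian approach is not realized. In this paper, we optimize both the parameters of the prior distribution and the posterior temperature using Bayesian optimization. Well-tempered posteriors lead to better predictive performance and improved uncertainty calibration, which we demonstrate for the task of sparse-view CT reconstruction.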
Affordance detection from visual input is a fundamental step in autonomous robotic manipulation. Existing solutions to the problem of affordance detection rely on convolutional neural networks. However, these networks do not consider the spatial arrangement of the input data and miss parts-to-whole relationships. Therefore, they fall short when confronted with novel, previously unseen object instances or new viewpoints. One way to overcome such limitations is to resort to capsule networks. In this paper, we introduce the first affordance detection network based on dynamic tree-structured capsules for sparse 3D point clouds. We show that our capsule-based network outperforms current state-of-the-art models on viewpoint invariance and parts-segmentation of new object instances, using a novel dataset that we used only for evaluation and that is publicly available from github.com/gipfelen/DTCG-Net. In the experimental evaluation, we show that our algorithm is superior to current affordance detection methods when faced with grasping previously unseen objects, thanks to our capsule network enforcing a parts-to-whole representation.
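Recent developments in the angle-resolved photoemission spectroscopy (ARPES) technique involve spatially resolving samples while maintaining the high-resolution features of momentum space. This development readily increases the size and complexity of the data to be analyzed, where one of the tasks is to label similar dispersion cuts and map them spatially. In this work, we demonstrate that recent representation learning (self-supervised learning) models combined with k-means clustering can help automate this part of the data analysis and save valuable time, albeit with low performance. Finally, we introduce few-shot learning (k-nearest neighbors, or kNN) in the representation space, where we selectively choose one image reference (k = 1) for each known label and subsequently label the rest of the data according to the nearest reference image. This last approach demonstrates the strength of self-supervised learning for automating image analysis, in ARPES in particular, and can be generalized to any scientific data analysis that relies heavily on image data.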
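The one-reference-per-label step described above can be sketched as a 1-nearest-neighbor assignment in embedding space. The sketch assumes the embeddings have already been produced by some self-supervised encoder and that Euclidean distance is the measure of closeness. Both are illustrative assumptions, as are the function and parameter names.

```python
import math


def label_by_nearest_reference(embeddings, references):
    """Assign each embedding the label of its closest reference embedding.

    embeddings: list of equal-length numeric vectors, one per image.
    references: dict mapping each known label to exactly one reference
        vector (k = 1), in the same embedding space.
    """
    if not references:
        raise ValueError("at least one labeled reference is required")
    labels = []
    for vector in embeddings:
        # Euclidean distance is an assumption; other metrics would work too.
        nearest = min(references, key=lambda lab: math.dist(vector, references[lab]))
        labels.append(nearest)
    return labels
```

For example, with references `{"flat": [0.0, 0.0], "steep": [10.0, 10.0]}`, the embedding `[1.0, 1.0]` would be labeled `"flat"`.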
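Almost none of the more than 2,000 languages spoken in Africa have widely available automatic speech recognition systems, and the required data is available for only a few of them. We have experimented with two techniques that may provide pathways to large-vocabulary speech recognition for African languages: multilingual modeling and self-supervised learning. We gathered available open-source data and collected data for 15 languages, and trained experimental models using these techniques. Our results show that pooling the small amounts of available data in multilingual end-to-end models and pre-training on unsupervised data can help improve speech recognition quality for many African languages.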